Hashing

Definition:

Hashing transforms input data of arbitrary size (a message) into a fixed-size value, usually displayed as a hexadecimal string. The output, known as the hash value, hash code, or digest, is produced by a hash function. Hashing is widely used for data integrity verification, password storage, and digital signatures.

Characteristics:

Deterministic:
- The same input will always produce the same hash output. This property allows for consistent verification of data integrity.
Fixed Size:
- Regardless of the size of the input data, the output hash will always be of a fixed length, making it easier to handle and compare.
Fast Computation:
- Hash functions are designed to compute hash values quickly, allowing for efficient data processing.
Pre-image Resistance:
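- Given a hash output, it should be computationally infeasible to recover an input that produces it, so the hash does not directly reveal the original data. Short or predictable inputs can still be found by guessing candidates and hashing them.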
Collision Resistance:
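- It should be computationally infeasible to find two different inputs that produce the same hash output. Collisions must exist because the output is fixed-size, but if none can be found in practice, an attacker cannot swap in tampered data that matches the original hash.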
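
The first two properties, determinism and fixed output size, can be observed with a very small hash function. The sketch below uses 64-bit FNV-1a, chosen only because it fits in a few lines of standard C++. It is NOT a cryptographic hash and offers neither pre-image nor collision resistance. The helper name `toHex` is an illustrative choice.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// 64-bit FNV-1a: a simple NON-cryptographic hash, used here only to
// illustrate determinism and fixed output size.
std::uint64_t fnv1a64(const std::string &data) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;                 // mix in the next input byte
        hash *= 0x100000001b3ULL;     // multiply by the FNV prime (wraps mod 2^64)
    }
    return hash;
}

// Render the 64-bit value as exactly 16 lowercase hex characters.
std::string toHex(std::uint64_t value) {
    std::ostringstream oss;
    oss << std::hex << std::setw(16) << std::setfill('0') << value;
    return oss.str();
}
```

Calling `fnv1a64` twice on the same string returns the same value, and `toHex(fnv1a64(s))` is 16 characters long whether `s` is empty or several megabytes.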

Common Hashing Algorithms:

MD5 (Message Digest 5):
- Produces a 128-bit hash value and is still common for checksums that detect accidental corruption. Practical collision attacks exist, so it should not be used where an attacker may control the input.
SHA-1 (Secure Hash Algorithm 1):
- Produces a 160-bit hash value. A practical collision was demonstrated in 2017, and like MD5 it is not recommended for security-sensitive applications.
SHA-256:
- Part of the SHA-2 family, it produces a 256-bit hash value and is widely used in security applications and protocols, including SSL/TLS and Bitcoin.
bcrypt:
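- A function designed specifically for hashing passwords. It incorporates a per-password salt to defeat rainbow table attacks and, unlike the general-purpose hashes above, is deliberately slow, with an adjustable cost factor that makes brute-force guessing expensive.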
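
The bit lengths above translate directly into the length of the hexadecimal strings seen in practice, since each hex character encodes 4 bits. A minimal sketch of that arithmetic follows; `hexLength` is a hypothetical helper name, and bcrypt is left out because its output is a formatted string rather than a bare digest.

```cpp
#include <cstddef>

// Number of hex characters needed to print a digest of `bits` bits
// (each hex character encodes 4 bits).
constexpr std::size_t hexLength(std::size_t bits) { return bits / 4; }

// Checked at compile time.
static_assert(hexLength(128) == 32, "MD5 prints as 32 hex characters");
static_assert(hexLength(160) == 40, "SHA-1 prints as 40 hex characters");
static_assert(hexLength(256) == 64, "SHA-256 prints as 64 hex characters");
```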

Time Complexity:

Hash Computation Time: $O(n)$
Computing a hash takes time linear in the input length $n$, because algorithms such as MD5, SHA-1, and SHA-256 read the input block by block in a single pass.

Space Complexity:

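Space Complexity: $O(1)$
The hash output has a fixed size regardless of the input, so storing a hash value takes constant space. MD5, SHA-1, and SHA-256 also keep only a fixed-size internal state while reading the input, so data can be hashed as a stream without being held in memory all at once.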
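
Both bounds can be seen in a streaming hasher: each input byte is visited once, and the state never grows. The sketch below is an illustration under stated assumptions rather than a production design. `Fnv1aStream` is a hypothetical class built on the non-cryptographic 64-bit FNV-1a hash, picked because it needs only the standard library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical streaming hasher based on 64-bit FNV-1a (NOT cryptographic).
// Its whole state is two integers, however much data is fed in.
class Fnv1aStream {
public:
    // Feed the next chunk; every byte is processed exactly once.
    void update(const std::string &chunk) {
        for (unsigned char byte : chunk) {
            state_ ^= byte;
            state_ *= 0x100000001b3ULL;  // FNV prime
            ++bytesSeen_;
        }
    }

    std::uint64_t digest() const { return state_; }
    std::size_t bytesSeen() const { return bytesSeen_; }

private:
    std::uint64_t state_ = 0xcbf29ce484222325ULL;  // FNV offset basis
    std::size_t bytesSeen_ = 0;                    // counts work done: O(n)
};
```

Feeding `"Hello, "` and then `"World!"` yields the same digest as feeding `"Hello, World!"` in one call, which is why large files can be hashed chunk by chunk.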

C++ Implementation of Hashing (Using SHA-256):

#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

#include <openssl/sha.h>  // OpenSSL; link with -lcrypto

// Returns the SHA-256 digest of `data` as a 64-character lowercase hex string.
std::string sha256(const std::string &data) {
    unsigned char hash[SHA256_DIGEST_LENGTH];  // 32 bytes
    SHA256(reinterpret_cast<const unsigned char *>(data.data()), data.size(), hash);

    // Print each byte as two zero-padded hex digits.
    std::ostringstream oss;
    oss << std::hex << std::setfill('0');
    for (unsigned char byte : hash) {
        oss << std::setw(2) << static_cast<int>(byte);
    }
    return oss.str();
}

int main() {
    const std::string data = "Hello, World!";
    const std::string hashValue = sha256(data);

    std::cout << "Data: " << data << '\n';
    std::cout << "SHA-256 Hash: " << hashValue << '\n';

    return 0;
}
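
A typical use of such a function is integrity verification: hash the received data again and compare the result with a digest recorded earlier. The sketch below shows one way to do that. `matchesDigest` is a hypothetical helper, and it assumes `hashFn` returns a lowercase hex string, as `sha256` above does. The hash function is passed in as a parameter so the helper does not depend on any particular library.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <string>

// Hypothetical helper: true when hashing `data` with `hashFn` reproduces
// the previously recorded digest `expectedHex`.
// Assumes `hashFn` returns lowercase hex; `expectedHex` may be in either case.
bool matchesDigest(const std::string &data,
                   std::string expectedHex,
                   const std::function<std::string(const std::string &)> &hashFn) {
    // Normalise the recorded digest to lowercase before comparing.
    std::transform(expectedHex.begin(), expectedHex.end(), expectedHex.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return hashFn(data) == expectedHex;
}
```

With the listing above, a call would look like `matchesDigest(fileContents, publishedDigest, sha256)`, where `fileContents` and `publishedDigest` are placeholder names for the data and the digest obtained from a trusted source.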

Summary:

Hashing is a fundamental technique in computer science and cryptography, used for data integrity verification, password management, and digital signatures. A hash function that is still considered secure, such as SHA-256, resists the collision attacks that have broken MD5 and SHA-1, while passwords call for a purpose-built function such as bcrypt. Understanding these properties is essential for choosing the right hash function when building secure systems and applications.
